Estimation of glottal closure instants from telephone speech using a group delay-based approach that considers speech signal as a spectrum
Authors
Abstract
Glottal closure instants (GCIs) are characterized by a strong negative valley in the speech signal and an abrupt change in amplitude. In this paper, an algorithm that exploits these two properties of a GCI is proposed to estimate the locations of GCIs, specifically from telephone speech. The algorithm treats a symmetrized voiced segment as the Fourier transform of an even signal. In such a case, the negative valleys in the spectrum correspond to zeros that lie outside the unit circle in the z-plane. The angular locations of these zeros indicate the locations of the GCIs. The angular locations can be estimated from the group delay spectrum of the even signal, since a phase change of 2π between adjacent frequency bins occurs at the location of a zero that lies outside the unit circle. The performance of the algorithm is evaluated on simulated speech corpora derived from the CMU and CSTR databases and on the NTIMIT database, in terms of identification, false alarm, and miss rates. The proposed algorithm is compared with DYPSA, YAGA, and SEDREAMS, and is found to outperform all three when used on telephone speech.
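The abstract relies on a general signal-processing property: a zero near the unit circle leaves a sharp mark in the group delay at its angular location, and the sign of that mark depends on whether the zero lies inside or outside the circle. The following is a minimal sketch of that property only, not the authors' GCI algorithm. The helper names (`group_delay`, `zero_pair`), the radii 1.05 and 0.95, and the 512-point DFT are assumptions chosen for illustration.

```python
import numpy as np

N_FFT = 512   # assumed DFT length for this illustration
BIN = 64      # zero is placed exactly at this DFT bin (angle = pi/4)
THETA = 2.0 * np.pi * BIN / N_FFT


def group_delay(x, n_fft=N_FFT):
    """Group delay in samples, tau = Re{ DFT(n*x[n]) / DFT(x[n]) }."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.fft(x, n_fft)
    Y = np.fft.fft(n * x, n_fft)
    # Small constant guards against division by zero at spectral nulls.
    return np.real(Y * np.conj(X)) / (np.abs(X) ** 2 + 1e-12)


def zero_pair(radius, angle):
    """Real FIR sequence with a conjugate zero pair at radius * exp(+/- j*angle)."""
    return np.array([1.0, -2.0 * radius * np.cos(angle), radius ** 2])


# Zero pair just outside the unit circle: large positive group delay peak.
gd_outside = group_delay(zero_pair(1.05, THETA))
# Zero pair just inside the unit circle: large negative group delay valley.
gd_inside = group_delay(zero_pair(0.95, THETA))

half = N_FFT // 2
peak_bin = int(np.argmax(gd_outside[:half]))
valley_bin = int(np.argmin(gd_inside[:half]))

print("outside: peak at bin", peak_bin, "value", round(float(gd_outside[peak_bin]), 2))
print("inside: valley at bin", valley_bin, "value", round(float(gd_inside[valley_bin]), 2))
```

In this toy case, the bin index of the positive peak recovers the angle of the zero outside the unit circle. In the approach described above, the corresponding angular location is what would be mapped back to a time instant within the symmetrized voiced segment.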
Similar resources
Voice source cepstrum processing for speaker identification
Voice source analysis and modelling has played a key role in important speech applications such as speech recognition, speech synthesis and speaker recognition. This work presents a robust algorithm for glottal closure detection and a novel set of voice source features for speaker recognition. In the first part of the dissertation, the DYPSA algorithm is developed for detecting glottal closure ins...
Prosodic manipulation using instants of significant excitation
This paper proposes a technique for prosodic (pitch and duration) manipulation using instants of significant excitation. Instants of significant excitation correspond to the instants of glottal closure (epochs) in voiced speech and to some random excitations like burst onset in the case of nonvoiced speech. Instants of significant excitation are computed from the average group delay of minimum ...
Exploring Bessel Features for Detection of Glottal Closure Instants
For voiced speech, the most significant excitation takes place around the instant of glottal closure. Glottal closure instant (GCI) information is useful for accurate speech analysis. In particular, accurate spectrum analysis is performed by considering the speech in the intervals of glottal closure. In this paper, we propose an approach for detection of GCIs by exploring Bessel features, and the ...
Classification-Based Detection of Glottal Closure Instants from Speech Signals
In this paper, a classification-based method for the automatic detection of glottal closure instants (GCIs) from the speech signal is proposed. Peaks in the speech waveform are taken as candidates for GCI placement. A classification framework is used to train a classification model and to classify whether or not a peak corresponds to a GCI. We show that the detection accuracy in terms of F1 ...
Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals
Pitch tracking and pitch marking (PM) are two important speech signal analysis techniques for several applications. For example, the accuracy of both pitch marking and tracking is important for generating smooth synthesized speech by controlling the pitch and duration of voiced speech in a Text-to-Speech (TTS) system. In this paper, we present a novel hybrid approach, combining electro-glotto-graph...
Publication date: 2015